obesity and counties, respectively. Use the base plot function to check that counties includes the polygon elements you expect. Hint: reading in an RDS file from a website requires that you run the file through a decompressor before loading it via readRDS. R has a built-in decomopressor function called gzcon. (2 points)obesity <- read.csv("https://raw.githubusercontent.com/HimesGroup/BMIN503/master/DataFiles/county_obesity_prevalence.csv", header = TRUE)
counties <- readRDS(gzcon(url("https://raw.githubusercontent.com/HimesGroup/BMIN503/master/DataFiles/uscounties_2010.rds")))
plot(st_geometry(counties))+ What were the 2004 and 2014 obesity rates for Orange County, California? For Orange County, Texas? Show all variables associated with these counties in the BRFSS and county polygons datasets. Aside from county names, what identifiers do these datasets share? *(2 points)*
obesity[obesity$county == "Orange County" & obesity$state == "California",]## state fips.code county age.adjusted.percent.2004
## 219 California 6059 Orange County 18.4
## age.adjusted.percent.2014
## 219 19.4
counties[counties$STATE == "6" & counties$COUNTY == "059", ]## Simple feature collection with 1 feature and 6 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -118.1167 ymin: 33.38663 xmax: -117.413 ymax: 33.94641
## epsg (SRID): 4326
## proj4string: +proj=longlat +datum=WGS84 +no_defs
## GEO_ID STATE COUNTY NAME LSAD CENSUSAREA
## 683 0500000US06059 6 059 Orange County 790.568
## geometry
## 683 MULTIPOLYGON (((-118.093 33...
obesity[obesity$county == "Orange County" & obesity$state == "Texas",]## state fips.code county age.adjusted.percent.2004
## 2707 Texas 48361 Orange County 25.3
## age.adjusted.percent.2014
## 2707 29.2
counties[counties$STATE == "48" & counties$COUNTY == "361", ]## Simple feature collection with 1 feature and 6 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -94.1176 ymin: 29.91442 xmax: -93.69525 ymax: 30.24432
## epsg (SRID): 4326
## proj4string: +proj=longlat +datum=WGS84 +no_defs
## GEO_ID STATE COUNTY NAME LSAD CENSUSAREA
## 1194 0500000US48361 48 361 Orange County 333.669
## geometry
## 1194 MULTIPOLYGON (((-93.90145 3...
The obesity rate in Orange County, California was 18.4% in 2004 and 19.4% in 2014. The obesity rate in Orange County, Texas was 25.3% in 2004 and 29.2% in 2014. The fips.code variable in the BRFSS dataset match the STATE and COUNTY ID’s in the county polygons dataset.
+ Merge the two datasets so that `counties` contains state names and obesity rates for 2004 and 2014. *(3 points)*
library(tidyverse)
counties$fips.code <- paste0(counties$STATE, counties$COUNTY)
obesity$fips.code <- as.character(obesity$fips.code)
counties <- inner_join(counties, obesity, by = "fips.code")
head(counties)## Simple feature collection with 6 features and 11 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -88.13999 ymin: 31.19518 xmax: -85.12342 ymax: 34.8922
## epsg (SRID): 4326
## proj4string: +proj=longlat +datum=WGS84 +no_defs
## GEO_ID STATE COUNTY NAME LSAD CENSUSAREA fips.code state
## 1 0500000US01001 1 001 Autauga County 594.436 1001 Alabama
## 2 0500000US01009 1 009 Blount County 644.776 1009 Alabama
## 3 0500000US01017 1 017 Chambers County 596.531 1017 Alabama
## 4 0500000US01021 1 021 Chilton County 692.854 1021 Alabama
## 5 0500000US01033 1 033 Colbert County 592.619 1033 Alabama
## 6 0500000US01045 1 045 Dale County 561.150 1045 Alabama
## county age.adjusted.percent.2004 age.adjusted.percent.2014
## 1 Autauga County 29.8 33.8
## 2 Blount County 25.4 34.9
## 3 Chambers County 30.5 40
## 4 Chilton County 27.6 36
## 5 Colbert County 27.4 37.8
## 6 Dale County 28.8 36.6
## geometry
## 1 MULTIPOLYGON (((-86.49677 3...
## 2 MULTIPOLYGON (((-86.5778 33...
## 3 MULTIPOLYGON (((-85.18413 3...
## 4 MULTIPOLYGON (((-86.51734 3...
## 5 MULTIPOLYGON (((-88.13999 3...
## 6 MULTIPOLYGON (((-85.41644 3...
+ For each year (i.e., 2004 and 2014), create a static choropleth map of county-level obesity rates for the US using _ggplot2_. Add a title with `ggtitle`, remove county borders with `lwd=0` in the `geom_sf` call, and incorporate custom theme elements with the user-created `my_theme()` function. Some code to get you started with these maps is offered below. Feel free to change plot aesthetics or choose a different color palette. *(4 points)*
counties <- counties %>%
mutate(age.adjusted.percent.2004 = as.numeric(as.character(age.adjusted.percent.2004)),
age.adjusted.percent.2014 = as.numeric(as.character(age.adjusted.percent.2014)))
# Use a fixed color scale to more easily compare obesity rates between maps
prev_min <- min(c(counties$age.adjusted.percent.2004, counties$age.adjusted.percent.2014), na.rm = TRUE)
prev_max <- max(c(counties$age.adjusted.percent.2004, counties$age.adjusted.percent.2014), na.rm = TRUE)
library(RColorBrewer)
my_theme <- function() {
theme_minimal() +
theme(axis.line = element_blank(),
axis.text = element_blank(),
axis.title = element_blank(),
panel.grid = element_line(color = "white"),
legend.key.size = unit(0.8, "cm"),
legend.text = element_text(size = 16),
legend.title = element_text(size = 16),
plot.title = element_text(size = 22))
}
myPalette <- colorRampPalette(brewer.pal(9, "YlOrRd"))
ggplot() +
geom_sf(data = counties, aes(fill = age.adjusted.percent.2004), lwd = 0) +
my_theme() +
ggtitle("County-level obesity rates in 2004") +
scale_fill_gradientn(name = "Obesity rate (%)", colours = myPalette(100),
limit = range(prev_min, prev_max)) ggplot() +
geom_sf(data = counties, aes(fill = age.adjusted.percent.2014), lwd = 0) +
my_theme() +
ggtitle("County-level obesity rates in 2014") +
scale_fill_gradientn(name = "Obesity rate (%)", colours = myPalette(100),
limit = range(prev_min, prev_max)) + How did obesity rates in adults change between 2004 and 2014? (Qualitative answer is sufficient!) *(2 points)*
Obesity rates rose substantially in most parts of the country, with many counties having a percentage point increase of approximately 10.
counties that tracks the change in obesity rate for each county between 2004 and 2014. Be sure to code this variable so that a positive value indicates an increase in the prevalence of obesity. (1 point)counties$prevchange <- counties$age.adjusted.percent.2014 - counties$age.adjusted.percent.2004Create an interactive choropleth map using the leaflet library to visualize changes in county-level obesity rates between 2004 and 2014. Be sure to include a legend and scalebar for your map. The popup message should include the county name, state name, and change in obesity rate for the selected county. For example, the popup message for Philadelphia, Pennsylvania should read: > Philadelphia County, Pennsylvania > Change in obesity rate (2004-2014):
Hint: you can use the leaflet code from the practicum as a starting point. (5 points)
pal_fun <- colorBin(palette = brewer.pal(9, "RdBu")[c(1:5, 7)],
bins = c(-3, -1, 1, 5, 9, 13, 17), reverse = TRUE,
NULL)
pu_message <- paste0(counties$county, ", ", counties$state,
"<br>Change in obesity rate (2004-2014): ",
round(counties$prevchange, 1), "%")
leaflet(counties) %>%
addPolygons(stroke = FALSE,
fillColor = ~pal_fun(prevchange),
fillOpacity = 0.8, smoothFactor = 0.5,
popup = pu_message) %>%
addProviderTiles(providers$CartoDB.Positron) %>%
addLegend("bottomright",
pal=pal_fun,
values=~prevchange,
title = 'Change in obesity rate 2004-2014 (%)',
opacity = 1) %>%
addScaleBar()vars. You do not need to show a preview of vars. (1 point)vars <- load_variables(dataset = "acs5", year = 2016)Use get_acs from tidycensus to query a variable of your choice from the ACS 5-year estimates for 2012-2016 at the county level. For percentages, be sure to obtain both an estimate and a total. For full points, assign the ACS data you obtained to a data frame named acs.data. Make sure each county is represented by no more than a single row of acs.data and each column of acs.data contains no more than a single variable and has a meaningful name. We are only interested in mapping the estimate, so you can remove information about margin of error. Show the first six rows of acs.data. It is up to you which variable you would like to map. Possible options include:
Poverty rate (estimate: B17010_002, total: B17010_001)
Median household income (B19013_001)
Median house value (B25077_001) (4 points)
# Obtaining information about median house value
acs.data <- get_acs(geography = "county", variables = "B25077_001", year = 2016, survey = "acs5")
acs.data <- acs.data %>%
select(GEOID, NAME, estimate) %>%
rename(median_houseval = estimate)
head(acs.data)## # A tibble: 6 x 3
## GEOID NAME median_houseval
## <chr> <chr> <dbl>
## 1 01001 Autauga County, Alabama 141000
## 2 01003 Baldwin County, Alabama 173400
## 3 01005 Barbour County, Alabama 90300
## 4 01007 Bibb County, Alabama 97200
## 5 01009 Blount County, Alabama 124200
## 6 01011 Bullock County, Alabama 68000
counties with acs.data so that counties contains the ACS variables you have chosen to map. Show the first six rows of counties. (2 points)acs.data <- acs.data %>%
mutate(fips.code = as.character(as.numeric(as.character(GEOID))))
counties <- inner_join(counties, acs.data, by = "fips.code")
head(counties)## Simple feature collection with 6 features and 15 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -88.13999 ymin: 31.19518 xmax: -85.12342 ymax: 34.8922
## epsg (SRID): 4326
## proj4string: +proj=longlat +datum=WGS84 +no_defs
## GEO_ID STATE COUNTY NAME.x LSAD CENSUSAREA fips.code state
## 1 0500000US01001 1 001 Autauga County 594.436 1001 Alabama
## 2 0500000US01009 1 009 Blount County 644.776 1009 Alabama
## 3 0500000US01017 1 017 Chambers County 596.531 1017 Alabama
## 4 0500000US01021 1 021 Chilton County 692.854 1021 Alabama
## 5 0500000US01033 1 033 Colbert County 592.619 1033 Alabama
## 6 0500000US01045 1 045 Dale County 561.150 1045 Alabama
## county age.adjusted.percent.2004 age.adjusted.percent.2014
## 1 Autauga County 29.8 33.8
## 2 Blount County 25.4 34.9
## 3 Chambers County 30.5 40.0
## 4 Chilton County 27.6 36.0
## 5 Colbert County 27.4 37.8
## 6 Dale County 28.8 36.6
## prevchange GEOID NAME.y median_houseval
## 1 4.0 01001 Autauga County, Alabama 141000
## 2 9.5 01009 Blount County, Alabama 124200
## 3 9.5 01017 Chambers County, Alabama 83000
## 4 8.4 01021 Chilton County, Alabama 99100
## 5 10.4 01033 Colbert County, Alabama 102500
## 6 7.8 01045 Dale County, Alabama 103800
## geometry
## 1 MULTIPOLYGON (((-86.49677 3...
## 2 MULTIPOLYGON (((-86.5778 33...
## 3 MULTIPOLYGON (((-85.18413 3...
## 4 MULTIPOLYGON (((-86.51734 3...
## 5 MULTIPOLYGON (((-88.13999 3...
## 6 MULTIPOLYGON (((-85.41644 3...
library(leaflet)
pal_fun <- colorNumeric("BuPu", NULL)
pu_message <- paste0(counties$county, ", ", counties$state,
"<br>Median House Value: ",
"$", counties$median_houseval)
leaflet(counties) %>%
addPolygons(stroke = FALSE,
fillColor = ~pal_fun(median_houseval),
fillOpacity = 0.8, smoothFactor = 0.5,
popup = pu_message) %>%
addProviderTiles(providers$CartoDB.Positron) %>%
addLegend("bottomright",
pal=pal_fun,
values=~median_houseval,
title = 'Median House Value ($)',
opacity = 1) %>%
addScaleBar()Alternatively, mapping percent poverty…
acs.data2 <- get_acs(geography = "county", variables = c("B17010_001", "B17010_002"), year = 2016, survey = "acs5")
acs.data2 <- acs.data2 %>%
group_by(GEOID, NAME) %>%
summarise(num_poverty = estimate[variable == "B17010_002"],
num_total = estimate[variable == "B17010_001"]) %>%
mutate(percent_poverty = round(num_poverty/num_total*100, 1),
fips.code = as.character(as.numeric(as.character(GEOID))))
counties2 <- inner_join(counties, acs.data2, by = "fips.code")
pal_fun <- colorNumeric("BuPu", NULL)
pu_message <- paste0(counties2$county, ", ", counties2$state,
"<br>Poverty rate: ",
counties2$percent_poverty, "%")
leaflet(counties2) %>%
addPolygons(stroke = FALSE,
fillColor = ~pal_fun(percent_poverty),
fillOpacity = 0.8, smoothFactor = 0.5,
popup = pu_message) %>%
addProviderTiles(providers$CartoDB.Positron) %>%
addLegend("bottomright",
pal=pal_fun,
values=~percent_poverty,
title = 'Poverty rate (%)',
opacity = 1) %>%
addScaleBar()Median house value: House values are highest along the coast of the California Bay Area.
Percent poverty: Poverty rates are highest in many counties of Louisiana and Missippi, as well as southern South Dakota and the southern tip of Texas.